Herme the Collection of Natural Speech Data
Abstract
This paper describes our approach to collecting ‘natural’ (i.e., representative) data from spoken interactions in a social setting, in the context of the development over time of expressive speech synthesis. Over the past ten years or so, we have collected several corpora of unprompted social conversations that illustrate the ‘contact’ element of speech, which was lacking in many corpora collected by setting paid participants a specific ‘task’. The paper discusses the technical and ethical issues of collecting such spoken material and highlights some of the problems we have encountered in processing this much-needed data. Through the use of attractive conversational devices, we have found that natural human curiosity and an element of social programming combine to provide a rich source of material that complements the task-based collections from paid informants.
Related articles
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Gadamer’s Ambivalence toward the Enlightenment Project
This essay explores Gadamer’s ambivalent relationship with modernity. Gadamer is a prominent critic of the Enlightenment project. His criticisms are both theoretical and practical. Theoretically, representationalism is at the center of modern epistemology for Gadamer. Practically, Gadamer sees the demotion of prudence (phronesis) as fundamental to the “bad” Enlightenment. Gadamer’s attempt...
Multi-Site Data Collection for a Spoken Language Corpus
This paper describes a recently collected spoken language corpus for the ATIS (Air Travel Information System) domain. This data collection effort has been co-ordinated by MADCOW (Multi-site ATIS Data COllection Working group). We summarize the motivation for this effort, the goals, the implementation of a multi-site data collection paradigm, and the accomplishments of MADCOW in monitoring the c...
A Contrastive Study of Request Speech Act in English and Persian Novels: Natural Semantic Metalanguage Approach
The Natural Semantic Metalanguage (NSM) Approach claims that some universals are shared by all languages. Speech acts seem to be present in all languages, but research within this approach has not indicated whether the request speech act differs from one language to another. Thus, this study intended to investigate whether request strategies are used differently in English and Persian roma...